fitting error
A robust inlier identification algorithm for point cloud registration via $\ell_0$-minimization
Correspondences in point cloud registration are prone to outliers, significantly reducing registration accuracy and highlighting the need for precise inlier identification. In this paper, we propose a robust inlier identification algorithm for point cloud registration by reformulating the conventional registration problem as an alignment error $\ell_0$-minimization problem. The $\ell_0$-minimization problem is formulated for each local set, where those local sets are built on a compatibility graph of input correspondences. To resolve the $\ell_0$-minimization, we develop a novel two-stage decoupling strategy, which first decouples the alignment error into a rotation fitting error and a translation fitting error. Second, null-space matrices are employed to decouple inlier identification from the estimation of rotation and translation respectively, thereby applying Bayesian theory to $\ell_0$-minimization problems and solving for fitting errors. Correspondences with the smallest errors are identified as inliers to generate a transformation hypothesis for each local set. The best hypothesis is selected to perform registration. We demonstrate that the proposed inlier identification algorithm is robust under high outlier ratios and noise through experiments. Extensive results on the KITTI, 3DMatch, and 3DLoMatch datasets demonstrate that our method achieves state-of-the-art performance compared to both traditional and learning-based methods in various indoor and outdoor scenes.
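The pipeline the abstract describes (generate transformation hypotheses from small correspondence subsets, score alignment residuals, keep the smallest-error correspondences as inliers) can be sketched in simplified form. The sketch below replaces the paper's $\ell_0$-minimization and null-space decoupling with a plain hypothesize-and-rank loop over random minimal sets; `kabsch`, `identify_inliers`, and all parameter values are illustrative names and settings, not the authors' implementation.

```python
import numpy as np

def kabsch(P, Q):
    """Best-fit rigid transform with R @ P[i] + t ≈ Q[i] (SVD/Kabsch solution)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t

def identify_inliers(P, Q, k, n_rounds=50, seed=0):
    """Fit transforms on random minimal sets; keep the k correspondences
    with the smallest alignment residuals under the best hypothesis."""
    rng = np.random.default_rng(seed)
    best_idx, best_err = None, np.inf
    for _ in range(n_rounds):
        sel = rng.choice(len(P), size=3, replace=False)
        R, t = kabsch(P[sel], Q[sel])
        res = np.linalg.norm((P @ R.T + t) - Q, axis=1)
        idx = np.argsort(res)[:k]            # smallest-error correspondences
        err = res[idx].sum()
        if err < best_err:
            best_idx, best_err = idx, err
    return best_idx
```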
Towards Understanding the Optimization Mechanisms in Deep Learning
Qi, Binchuan, Gong, Wei, Li, Li
Key insights from the studies Arjevani and Field (2022); Chizat, Oyallon, and Bach (2018); Du, Zhai, Póczos, and Singh (2018); Yun, Sra, and Jadbabaie (2018) emphasize the pivotal role of over-parameterization in finding the global optimum and enhancing the generalization ability of deep neural networks (DNNs). Recent work has shown that the evolution of the trainable parameters in infinite-width DNNs during training can be captured by the neural tangent kernel (NTK) (Arora, Du, Hu, Li, & Wang, 2019; Du, Lee, Li, Wang, & Zhai, 2018; Jacot, Gabriel, & Hongler, 2018; Mohamadi, Bae, & Sutherland, 2023; Wang, Li, & Sun, 2023; Zou, Cao, Zhou, & Gu, 2018). An alternative research direction examines infinite-width neural networks from a mean-field perspective (Chizat & Bach, 2018; Mei, Montanari, & Nguyen, 2018; Nguyen & Pham, 2023; Sirignano & Spiliopoulos, 2018). However, in practical applications, neural networks are of finite width, and under this condition it remains unclear whether NTK theory and mean-field theory can adequately characterize the convergence properties of neural networks (Seleznova & Kutyniok, 2021). Therefore, the mechanisms of non-convex optimization in deep learning, and the impact of over-parameterization on model training, remain incompletely understood.
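The NTK mentioned above admits a direct empirical computation on a finite network: $K(x_i, x_j) = \langle \nabla_\theta f(x_i), \nabla_\theta f(x_j) \rangle$. A minimal numpy sketch for a toy one-hidden-layer network, with the Jacobian estimated by central finite differences; all names and layer sizes are illustrative:

```python
import numpy as np

def mlp(params, x):
    """Tiny one-hidden-layer network: one scalar output per input row of x."""
    W1, b1, W2, b2 = params
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

def empirical_ntk(params, X, eps=1e-6):
    """Empirical NTK  K[i, j] = <df(x_i)/dθ, df(x_j)/dθ>,
    with the Jacobian estimated by central finite differences."""
    flat = np.concatenate([p.ravel() for p in params])
    shapes = [p.shape for p in params]

    def unflatten(v):
        out, i = [], 0
        for s in shapes:
            n = int(np.prod(s))
            out.append(v[i:i + n].reshape(s))
            i += n
        return out

    J = np.zeros((X.shape[0], flat.size))
    for k in range(flat.size):
        vp, vm = flat.copy(), flat.copy()
        vp[k] += eps
        vm[k] -= eps
        J[:, k] = (mlp(unflatten(vp), X) - mlp(unflatten(vm), X)).ravel() / (2 * eps)
    return J @ J.T   # Gram matrix of parameter gradients
```

By construction the result is a symmetric positive semi-definite Gram matrix, which is the property NTK-based analyses of training dynamics rely on.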
From Target Tracking to Targeting Track -- Part III: Stochastic Process Modeling and Online Learning
Li, Tiancheng, Wang, Jingyuan, Li, Guchong, Gao, Dengwei
This is the third part of a series of studies that model the target trajectory, which describes the target state evolution over continuous time, as a sample path of a stochastic process (SP). By adopting a deterministic-stochastic decomposition framework, we decompose the learning of the trajectory SP into two sequential stages: the first fits the deterministic trend of the trajectory using a curve function of time, while the second estimates the residual stochastic component through parametric learning of either a Gaussian process (GP) or Student's-t process (StP). This leads to a Markov-free, data-driven tracking approach that produces the continuous-time trajectory with minimal prior knowledge of the target dynamics. It not only takes advantage of the smooth trend of the target but also makes use of the long-term temporal correlation of both the data noise and the model fitting error. Simulations in four maneuvering target tracking scenarios have demonstrated its effectiveness and superiority in comparison with existing approaches. Target tracking, which involves the online estimation of the trajectory of a target, has been a long-standing research topic and plays a significant role in aerospace, traffic, defense, robotics, etc. [1]. In essence, target tracking is more about estimating the continuous-time trajectory of the target rather than merely a finite number of point states. The continuous-time trajectory enables the acquisition of a point estimate of the state at any time in the trajectory period; however, the converse is not true. A trajectory is a function of time taking values in the state space X, defined in the spatio-temporal space. Manuscript created Feb 2025; This work was supported in part by the National Natural Science Foundation of China under Grants 62422117 and 62201316 and in part by the Fundamental Research Funds for the Central Universities.
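The two-stage decomposition described (deterministic trend fit, then a stochastic-process model of the residual) can be illustrated with a minimal numpy sketch, using a polynomial trend and a plain GP posterior mean on the residuals. The kernel choice, lengthscale, and noise level are assumptions, and the paper's StP option and online learning are omitted:

```python
import numpy as np

def rbf(a, b, ls=1.0, var=1.0):
    """Squared-exponential kernel between two 1-D time grids."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

def fit_predict_trajectory(t, y, t_new, degree=2, noise=1e-2):
    """Stage 1: deterministic trend via polynomial fit.
    Stage 2: GP regression on the residuals (posterior mean only)."""
    coef = np.polyfit(t, y, degree)            # trend of the trajectory
    resid = y - np.polyval(coef, t)            # residual stochastic component
    K = rbf(t, t) + noise * np.eye(len(t))
    alpha = np.linalg.solve(K, resid)
    gp_mean = rbf(t_new, t) @ alpha            # GP correction at query times
    return np.polyval(coef, t_new) + gp_mean
```

Because the GP exploits temporal correlation in the residual, the combined prediction fits the observed trajectory more tightly than the trend alone.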
From Target Tracking to Targeting Track -- Part II: Regularized Polynomial Trajectory Optimization
Li, Tiancheng, Song, Yan, Li, Guchong, Li, Hao
Target tracking entails the estimation of the evolution of the target state over time, namely the target trajectory. Different from the classical state-space model, our series of studies, including this paper, models the collection of target states as a stochastic process (SP) that is further decomposed into a deterministic part, which represents the trend of the trajectory, and a residual SP representing the residual fitting error. Subsequently, the tracking problem is formulated as a learning problem regarding the trajectory SP, for which a key part is to estimate a trajectory function of time (T-FoT) best fitting the measurements in time series. For this purpose, we consider the polynomial T-FoT and address the regularized polynomial T-FoT optimization employing two distinct regularization strategies, seeking a trade-off between accuracy and simplicity. One limits the order of the polynomial, with the best choice determined by grid searching in a narrow, bounded range, while the other adopts $\ell_0$-norm regularization, for which a hybrid Newton solver is employed. Simulation results obtained in both single and multiple maneuvering target scenarios demonstrate the effectiveness of our approaches.
Establishing Task Scaling Laws via Compute-Efficient Model Ladders
Bhagia, Akshita, Liu, Jiacheng, Wettig, Alexander, Heineman, David, Tafjord, Oyvind, Jha, Ananya Harsh, Soldaini, Luca, Smith, Noah A., Groeneveld, Dirk, Koh, Pang Wei, Dodge, Jesse, Hajishirzi, Hannaneh
We develop task scaling laws and model ladders to predict the individual task performance of pretrained language models (LMs) in the overtrained setting. Standard power laws for language modeling loss cannot accurately model task performance. Therefore, we leverage a two-step prediction approach: first use model and data size to predict a task-specific loss, and then use this task loss to predict task performance. We train a set of small-scale "ladder" models, collect data points to fit the parameterized functions of the two prediction steps, and make predictions for two target models: a 7B model trained to 4T tokens and a 13B model trained to 5T tokens. Training the ladder models only costs 1% of the compute used for the target models. On four multiple-choice tasks written in ranked classification format, we can predict the accuracy of both target models within 2 points of absolute error. We have higher prediction error on four other tasks (average absolute error 6.9) and find that these are often tasks with higher variance in task metrics. We also find that using less compute to train fewer ladder models tends to deteriorate predictions. Finally, we empirically show that our design choices and the two-step approach lead to superior performance in establishing scaling laws.
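The two-step approach composes a power-law loss predictor with a loss-to-accuracy mapping. A schematic sketch follows, with made-up coefficients (the paper fits these from ladder-model runs) and a sigmoidal form assumed for the second step:

```python
import numpy as np

# Illustrative coefficients only -- not fitted values from the paper.
A, alpha, B, beta, E = 8.0, 0.3, 6.0, 0.28, 0.5   # step-1 power law
a, k, L0, b = 0.75, 4.0, 1.2, 0.25                 # step-2 sigmoid (b = chance accuracy)

def task_loss(N, D):
    """Step 1: task-specific loss from parameter count N and training tokens D."""
    return A / N**alpha + B / D**beta + E

def task_accuracy(L):
    """Step 2: map task loss to task accuracy with a sigmoidal link."""
    return a / (1.0 + np.exp(k * (L - L0))) + b

def predict(N, D):
    """Compose the two steps: (N, D) -> task loss -> task accuracy."""
    return task_accuracy(task_loss(N, D))
```

With fitted rather than assumed coefficients, the same composition is what lets small ladder models extrapolate to targets like a 7B model trained on 4T tokens.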
Efficient Diffusion as Low Light Enhancer
Lan, Guanzhou, Ma, Qianli, Yang, Yuqi, Wang, Zhigang, Wang, Dong, Li, Xuelong, Zhao, Bin
The computational burden of the iterative sampling process remains a major challenge in diffusion-based Low-Light Image Enhancement (LLIE). Current acceleration methods, whether training-based or training-free, often lead to significant performance degradation, highlighting the trade-off between performance and efficiency. In this paper, we identify two primary factors contributing to performance degradation: fitting errors and the inference gap. Our key insight is that fitting errors can be mitigated by linearly extrapolating the incorrect score functions, while the inference gap can be reduced by shifting the Gaussian flow to a reflectance-aware residual space. Based on these insights, we design the Reflectance-Aware Trajectory Refinement (RATR) module, a simple yet effective module that refines the teacher trajectory using the reflectance component of images. Following this, we introduce \textbf{Re}flectance-aware \textbf{D}iffusion with \textbf{Di}stilled \textbf{T}rajectory (\textbf{ReDDiT}), an efficient and flexible distillation framework tailored for LLIE. Our framework achieves performance comparable to previous diffusion-based methods that require many more steps in just 2 steps, while establishing new state-of-the-art (SOTA) results with 8 or 4 steps. Comprehensive experimental evaluations on 10 benchmark datasets validate the effectiveness of our method, consistently outperforming existing SOTA methods.
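The idea of linearly extrapolating incorrect score estimates can be illustrated on a stand-in smooth signal: first-order extrapolation from two consecutive estimates has second-order error in the step size, versus first-order error for simply reusing the current estimate. This is a generic numerical illustration, not the ReDDiT implementation:

```python
import numpy as np

def extrapolate(s_prev, s_curr):
    """First-order linear extrapolation of the next estimate."""
    return s_curr + (s_curr - s_prev)

# Toy check on a smoothly varying stand-in for a per-step score value.
t = np.linspace(0, 1, 50)
s = np.sin(2 * np.pi * t)
err_hold = np.abs(s[2:] - s[1:-1])                       # reuse current estimate
err_lin = np.abs(s[2:] - extrapolate(s[:-2], s[1:-1]))   # linear extrapolation
```

The extrapolated error equals the signal's second difference, so it shrinks quadratically with the step size while the hold-error shrinks only linearly.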
Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration
Wu, Kangxi, Pang, Liang, Shen, Huawei, Cheng, Xueqi
The black-box nature of large language models (LLMs) poses challenges in interpreting results, impacting issues such as data intellectual property protection and hallucination tracing. Training data attribution (TDA) methods are considered effective solutions to address these challenges. Most recent TDA methods rely on influence functions, assuming the model achieves minimized empirical risk. However, achieving this criterion is difficult, and sourcing accuracy can be compromised by fitting errors during model training. In this paper, we introduce a novel TDA method called Debias and Denoise Attribution (DDA), which enhances influence functions by addressing fitting errors. Specifically, the debias strategy seeks to improve the performance of influence functions by eliminating the knowledge bias present in the base model before fine-tuning, while the denoise strategy aims to reduce discrepancies in influence scores arising from varying degrees of fitting during the training process through smoothing techniques. Experimental results demonstrate that our method significantly outperforms existing approaches, achieving an average AUC of 91.64%. Moreover, DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
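The two strategies can be sketched generically: debiasing subtracts the base model's influence estimate, and denoising smooths influence scores across training checkpoints. Function and array names are illustrative, not DDA's actual interface:

```python
import numpy as np

def debias_denoise(scores, base_scores, window=3):
    """Debias: subtract the base model's influence estimate.
    Denoise: moving-average smoothing across training checkpoints.
    scores, base_scores: (num_checkpoints, num_examples) arrays."""
    debiased = scores - base_scores          # remove pre-finetuning knowledge bias
    kernel = np.ones(window) / window
    # Smooth each example's score trajectory along the checkpoint axis.
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, debiased)
```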
Subspace-Constrained Quadratic Matrix Factorization: Algorithm and Applications
Matrix factorization has emerged as a widely adopted framework for modeling data exhibiting low-rank structures. To address challenges in manifold learning, this paper presents a subspace-constrained quadratic matrix factorization model. The model is designed to jointly learn key low-dimensional structures, including the tangent space, the normal subspace, and the quadratic form that links the tangent space to a low-dimensional representation. We solve the proposed factorization model using an alternating minimization method, involving an in-depth investigation of nonlinear regression and projection subproblems. Theoretical properties of the quadratic projection problem and convergence characteristics of the alternating strategy are also investigated. To validate our approach, we conduct numerical experiments on synthetic and real-world datasets. Results demonstrate that our model outperforms existing methods, highlighting its robustness and efficacy in capturing core low-dimensional structures.
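The alternating-minimization strategy can be illustrated on the simpler linear low-rank model $X \approx UV$; the paper's model additionally constrains the factors to subspaces and includes a quadratic term, which this sketch omits:

```python
import numpy as np

def als_factorize(X, rank, n_iters=50, seed=0):
    """Alternating least squares for X ≈ U @ V: fix one factor, solve the
    other in closed form, and repeat. Each half-step is a least-squares
    subproblem, so the residual is non-increasing across iterations."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.normal(size=(m, rank))
    for _ in range(n_iters):
        V = np.linalg.lstsq(U, X, rcond=None)[0]        # fix U, solve for V
        U = np.linalg.lstsq(V.T, X.T, rcond=None)[0].T  # fix V, solve for U
    return U, V
```

On exactly low-rank data this converges rapidly to a near-zero residual, which is the behavior alternating schemes like the paper's rely on.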
Error Bounds of Supervised Classification from Information-Theoretic Perspective
Qi, Binchuan, Gong, Wei, Li, Li
There remains a list of unanswered research questions on deep learning (DL), including the remarkable generalization power of overparametrized neural networks, the efficient optimization performance despite non-convexity, and the mechanisms behind the role of flat minima in generalization. In this paper, we adopt an information-theoretic perspective to explore the theoretical foundations of supervised classification using deep neural networks (DNNs). Our analysis introduces the concepts of fitting error and model risk, which, together with the generalization error, constitute an upper bound on the expected risk. We demonstrate that the generalization error is bounded by the complexity, which is influenced by both the smoothness of the distribution and the sample size. Consequently, task complexity serves as a reliable indicator of the dataset's quality, guiding the setting of regularization hyperparameters. Furthermore, the derived upper bound relates the fitting error to the back-propagated gradient, the neural tangent kernel (NTK), and the model's parameter count. Utilizing the triangle inequality, we establish an upper bound on the expected risk. This bound offers valuable insights into the effects of overparameterization, non-convex optimization, and flat minima in DNNs. Finally, empirical verification shows a significant positive correlation between the derived theoretical bounds and the practical expected risk, confirming the practical relevance of the theoretical findings.
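The triangle-inequality decomposition described can be written schematically as follows; the symbols are assumed notation, not necessarily the paper's ($f$ is the learned model, $f^{*}$ the empirical risk minimizer in the model class, $R$ the expected risk, and $\hat{R}$ the empirical risk):

```latex
\begin{aligned}
R(f) \;\le\;
  \underbrace{\bigl|R(f) - \hat{R}(f)\bigr|}_{\text{generalization error}}
  \;+\; \underbrace{\hat{R}(f) - \hat{R}(f^{*})}_{\text{fitting error}}
  \;+\; \underbrace{\hat{R}(f^{*})}_{\text{model risk}}
\end{aligned}
```

The inequality follows by adding and subtracting $\hat{R}(f)$ and $\hat{R}(f^{*})$, so each of the three named quantities bounds one part of the gap between the expected risk and zero.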